Action Search: Spotting Actions in Videos and Its Application to Temporal Action Localization
State-of-the-art temporal action detectors inefficiently search the entire
video for specific actions. Despite the encouraging progress these methods
achieve, it is crucial to design automated approaches that only explore parts
of the video which are the most relevant to the actions being searched for. To
address this need, we propose the new problem of action spotting in video,
which we define as finding a specific action in a video while observing a small
portion of that video. Inspired by the observation that humans are extremely
efficient and accurate in spotting and finding action instances in video, we
propose Action Search, a novel Recurrent Neural Network approach that mimics
the way humans spot actions. Moreover, to address the absence of data recording
the behavior of human annotators, we put forward the Human Searches dataset,
which compiles the search sequences employed by human annotators spotting
actions in the AVA and THUMOS14 datasets. We consider temporal action
localization as an application of the action spotting problem. Experiments on
the THUMOS14 dataset reveal that our model is not only able to explore the
video efficiently (observing on average 17.3% of the video) but it also
accurately finds human activities with 30.8% mAP.
Comment: Accepted to ECCV 2018.
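To make the spotting loop concrete, below is a minimal, hypothetical sketch in PyTorch (all names, dimensions, and the stopping rule are assumptions, not the authors' released code): a recurrent cell observes the feature at the current temporal position and proposes the next position to inspect, so only a fraction of the video is ever seen.

    import torch
    import torch.nn as nn

    class ActionSpotter(nn.Module):
        """Sketch of a search policy: observe a snippet, decide where to look next."""
        def __init__(self, feat_dim=512, hidden_dim=1024):
            super().__init__()
            # Input: the observed snippet feature plus the current position in [0, 1].
            self.lstm = nn.LSTMCell(feat_dim + 1, hidden_dim)
            self.next_pos = nn.Linear(hidden_dim, 1)  # where to look next, in [0, 1]

        def forward(self, video_feats, start_pos=0.5, max_steps=10):
            # video_feats: (T, feat_dim) precomputed per-snippet features.
            T = video_feats.shape[0]
            h = torch.zeros(1, self.lstm.hidden_size)
            c = torch.zeros(1, self.lstm.hidden_size)
            pos = torch.tensor([[start_pos]])
            visited = []
            for _ in range(max_steps):
                idx = int(pos.item() * (T - 1))
                visited.append(idx)
                obs = torch.cat([video_feats[idx].unsqueeze(0), pos], dim=1)
                h, c = self.lstm(obs, (h, c))
                pos = torch.sigmoid(self.next_pos(h))  # predicted next location
            return visited  # indices inspected: a small fraction of the T snippets

    feats = torch.randn(300, 512)    # a 300-snippet video with 512-d features
    print(ActionSpotter()(feats))    # 10 inspected indices out of 300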
Few-Shot Transformation of Common Actions into Time and Space
This paper introduces the task of few-shot common action localization in time
and space. Given a few trimmed support videos containing the same but unknown
action, we strive for spatio-temporal localization of that action in a long
untrimmed query video. We do not require any class labels, interval bounds, or
bounding boxes. To address this challenging task, we introduce a novel few-shot
transformer architecture with a dedicated encoder-decoder structure optimized
for joint commonality learning and localization prediction, without the need
for proposals. Experiments on our reorganizations of the AVA and UCF101-24
datasets show the effectiveness of our approach for few-shot common action
localization, even when the support videos are noisy. Although our approach is
not specifically designed for common localization in time only, it also compares
favorably against the few-shot and one-shot state-of-the-art in this setting.
Lastly, we demonstrate that the few-shot transformer is easily extended to
common action localization per pixel.
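As a rough illustration of the proposal-free encoder-decoder idea, here is a hypothetical, deliberately simplified PyTorch sketch (module names, heads, and dimensions are assumptions): the encoder digests the support videos, each query frame attends to them in the decoder, and the resulting per-frame boxes and scores can be linked over time into an action tube.

    import torch
    import torch.nn as nn

    class FewShotLocalizer(nn.Module):
        def __init__(self, d_model=256):
            super().__init__()
            self.transformer = nn.Transformer(d_model=d_model, nhead=8, batch_first=True)
            self.box_head = nn.Linear(d_model, 4)    # normalized (x, y, w, h) per frame
            self.score_head = nn.Linear(d_model, 1)  # per-frame foreground confidence

        def forward(self, support_feats, query_feats):
            # support_feats: (S, d_model) tokens from the few trimmed support videos
            # query_feats:   (T, d_model) per-frame features of the untrimmed query video
            src = support_feats.unsqueeze(0)                     # encoder input
            tgt = query_feats.unsqueeze(0)                       # decoder input
            dec = self.transformer(src, tgt)                     # (1, T, d_model)
            boxes = self.box_head(dec).sigmoid().squeeze(0)      # (T, 4)
            scores = self.score_head(dec).sigmoid().squeeze(0)   # (T, 1)
            return boxes, scores  # thresholding + temporal linking yields a tube

    support = torch.randn(24, 256)   # e.g. 3 support videos x 8 sampled frames
    query = torch.randn(200, 256)    # 200 frames of the untrimmed query video
    boxes, scores = FewShotLocalizer()(support, query)
    print(boxes.shape, scores.shape)  # torch.Size([200, 4]) torch.Size([200, 1])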
Object priors for classifying and localizing unseen actions
This work strives for the classification and localization of human actions in
videos, without the need for any labeled video training examples. Where
existing work relies on transferring global attribute or object information
from seen to unseen action videos, we seek to classify and spatio-temporally
localize unseen actions in videos from image-based object information only. We
propose three spatial object priors, which encode local person and object
detectors along with their spatial relations. On top of these, we introduce three
semantic object priors, which extend semantic matching through word embeddings
with three simple functions that tackle semantic ambiguity, object
discrimination, and object naming. A video embedding combines the spatial and
semantic object priors. It enables us to introduce a new video retrieval task
that retrieves action tubes in video collections based on user-specified
objects, spatial relations, and object size. Experimental evaluation on five
action datasets shows the importance of spatial and semantic object priors for
unseen actions. We find that persons and objects have preferred spatial
relations that benefit unseen action localization, while using multiple
languages and simple object filtering directly improves semantic matching,
leading to state-of-the-art results for both unseen action classification and
localization.
Comment: Accepted to IJCV.
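The semantic matching step can be illustrated with a small, hypothetical sketch (toy vectors stand in for real word embeddings such as word2vec or FastText): each detected object votes for an unseen action in proportion to its detector confidence and the similarity of its name to the action name.

    import numpy as np

    # Hypothetical word embeddings; a real system would load pretrained vectors.
    emb = {
        "kayaking": np.array([0.9, 0.1, 0.0]),
        "kayak":    np.array([0.8, 0.2, 0.1]),
        "paddle":   np.array([0.7, 0.3, 0.0]),
        "dog":      np.array([0.0, 0.1, 0.9]),
    }

    def cosine(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

    def action_score(action, detections):
        # detections: list of (object_name, detector_confidence) found in the video.
        # Score the unseen action by how strongly detected objects relate to its name.
        return sum(conf * cosine(emb[action], emb[obj]) for obj, conf in detections)

    detections = [("kayak", 0.9), ("paddle", 0.7), ("dog", 0.2)]
    print(action_score("kayaking", detections))  # high: detected objects fit the action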